
    Efficiency and Power as a Function of Sequence Coverage, SNP Array Density, and Imputation

    High-coverage whole-genome sequencing provides near-complete information about genetic variation. However, other technologies can be more efficient in some settings by (a) reducing redundant coverage within samples and (b) exploiting patterns of genetic variation across samples. To characterize as many samples as possible, many genetic studies therefore employ lower-coverage sequencing or SNP array genotyping coupled to statistical imputation. To compare these approaches individually and in conjunction, we developed a statistical framework to estimate genotypes jointly from sequence reads, array intensities, and imputation. In European samples, we find similar sensitivity (89%) and specificity (99.6%) from imputation with either 1× sequencing or 1M SNP arrays. Sensitivity is increased, particularly for low-frequency polymorphisms (MAF <5%), when low-coverage sequence reads are added to dense genome-wide SNP arrays; the converse, however, is not true. At sites where sequence reads and array intensities produce different sample genotypes, joint analysis reduces genotype errors and identifies novel error modes. Our joint framework informs the use of next-generation sequencing in genome-wide association studies and supports the development of improved methods for genotype calling.
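The joint-calling idea in the abstract above can be sketched as a toy Bayesian combination of evidence. This is a minimal illustration, not the paper's actual model: the binomial read-error model, the flat array likelihoods used in the test, and all function names are assumptions.

```python
def read_likelihoods(n_ref, n_alt, err=0.01):
    # P(reads | genotype) for genotypes RR, RA, AA under a simple
    # binomial sequencing-error model (a textbook approximation).
    p_alt = {"RR": err, "RA": 0.5, "AA": 1.0 - err}
    return {g: (p ** n_alt) * ((1.0 - p) ** n_ref) for g, p in p_alt.items()}

def combine(read_lik, array_lik, prior):
    # Joint posterior over genotypes: treat read and array evidence as
    # independent, multiply by the prior (e.g. from imputation),
    # and normalise.
    post = {g: read_lik[g] * array_lik[g] * prior[g] for g in read_lik}
    z = sum(post.values())
    return {g: v / z for g, v in post.items()}
```

With ten alternate-allele reads and an uninformative array likelihood, the posterior concentrates on the homozygous-alternate genotype even under a prior favouring the heterozygote, which is the sense in which low-coverage reads sharpen array-plus-imputation calls.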

    Illuminating Choices for Library Prep: A Comparison of Library Preparation Methods for Whole Genome Sequencing of Cryptococcus neoformans Using Illumina HiSeq.

    The industry of next-generation sequencing is constantly evolving, with novel library preparation methods and new sequencing machines being released by the major sequencing technology companies annually. The Illumina TruSeq v2 library preparation method was the most widely used kit and the market leader; however, it has now been discontinued, and in 2013 was replaced by the TruSeq Nano and TruSeq PCR-free methods, leaving a gap in knowledge regarding which is the most appropriate library preparation method to use. Here, we took isolates of the pathogenic fungus Cryptococcus neoformans var. grubii and sequenced them using the existing TruSeq DNA v2 kit (Illumina), along with two new kits, the TruSeq Nano DNA kit (Illumina) and the NEBNext Ultra DNA kit (New England Biolabs), to provide a comparison. Compared to the original TruSeq DNA v2 kit, both newer kits gave equivalent or better sequencing data, with increased coverage. When comparing the two newer kits, we found little difference in cost and workflow, with the NEBNext Ultra being both slightly cheaper and faster than the TruSeq Nano. However, the quality of data generated using the TruSeq Nano DNA kit was superior, with higher coverage at regions of low GC content and more SNPs identified. Researchers should therefore evaluate their resources and the type of application (and hence the data quality required) when deciding which library prep method to use.

    Simulations of energetic beam deposition: from picoseconds to seconds

    We present a new method for simulating crystal growth by energetic beam deposition. The method combines a kinetic Monte Carlo simulation of the thermal surface diffusion with a small-scale molecular dynamics simulation of every single deposition event. We have implemented the method using the effective medium theory as a model potential for the atomic interactions, and present simulations for Ag/Ag(111) and Pt/Pt(111) for incoming energies up to 35 eV. The method is capable of following the growth of several monolayers at realistic growth rates of 1 monolayer per second, correctly accounting for both energy-induced atomic mobility and thermal surface diffusion. We find that the energy influences island and step densities and can induce layer-by-layer growth. We find an optimal energy for layer-by-layer growth (25 eV for Ag), which correlates with the energy at which the net impact-induced downward interlayer transport is at a maximum. A high step density is needed for energy-induced layer-by-layer growth; hence the effect dies away at increased temperatures, where thermal surface diffusion reduces the step density. As part of the development of the method, we present molecular dynamics simulations of single atom–surface collisions on flat parts of the surface and near straight steps, identify microscopic mechanisms by which the energy influences the growth, and discuss the nature of the energy-induced atomic mobility.
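The thermal-diffusion half of such a hybrid scheme rests on rejection-free kinetic Monte Carlo event selection. A minimal sketch of that selection step (the standard BKL algorithm) follows; the Arrhenius helper and all parameter values are illustrative assumptions, not taken from the paper.

```python
import math
import random

def arrhenius(nu0, barrier_eV, temperature_K):
    # Arrhenius rate k = nu0 * exp(-E / kB*T), with kB in eV/K.
    kB = 8.617e-5
    return nu0 * math.exp(-barrier_eV / (kB * temperature_K))

def kmc_step(rates, rng=random):
    # One rejection-free (BKL) KMC step: choose an event with
    # probability proportional to its rate, then advance the clock
    # by an exponentially distributed waiting time 1/sum(rates).
    total = sum(rates)
    r = rng.random() * total
    acc = 0.0
    for event, k in enumerate(rates):
        acc += k
        if r < acc:
            break
    dt = -math.log(1.0 - rng.random()) / total
    return event, dt
```

Deposition events would be interleaved with these diffusion steps at the chosen flux, each handled by a short MD run as the abstract describes.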

    The variant call format and VCFtools

    Summary: The variant call format (VCF) is a generic format for storing DNA polymorphism data such as SNPs, insertions, deletions and structural variants, together with rich annotations. VCF is usually stored in a compressed manner and can be indexed for fast retrieval of variants from a range of positions on the reference genome. The format was developed for the 1000 Genomes Project, and has also been adopted by other projects such as UK10K, dbSNP and the NHLBI Exome Project. VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging and comparing, and also provides a general Perl API.
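To illustrate the format itself, here is a minimal parser for a single VCF data line, using an example record from the VCF specification. It is a sketch for illustration only; real pipelines should parse against the header with a proper library (e.g. VCFtools' Perl API).

```python
def parse_vcf_line(line):
    # Split one tab-delimited VCF data line into its eight fixed
    # fields plus optional FORMAT and per-sample genotype columns.
    fields = line.rstrip("\n").split("\t")
    chrom, pos, vid, ref, alt, qual, flt, info = fields[:8]
    record = {
        "CHROM": chrom, "POS": int(pos), "ID": vid,
        "REF": ref, "ALT": alt.split(","),      # ALT may list several alleles
        "QUAL": qual, "FILTER": flt,
        # INFO is ;-separated key=value pairs; bare keys are flags.
        "INFO": dict(kv.split("=", 1) if "=" in kv else (kv, True)
                     for kv in info.split(";")),
    }
    if len(fields) > 9:
        record["FORMAT"] = fields[8].split(":")
        record["SAMPLES"] = fields[9:]
    return record
```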

    Rotational Excitation of HC_3N by H_2 and He at low temperatures

    Rates for rotational excitation of HC3N by collisions with He atoms and H2 molecules are computed for kinetic temperatures in the ranges 5–20 K and 5–100 K, respectively. These rates are obtained from extensive quantum and quasi-classical calculations using new, accurate potential energy surfaces (PESs).

    Quantifying single nucleotide variant detection sensitivity in exome sequencing

    BACKGROUND: The targeted capture and sequencing of genomic regions has rapidly demonstrated its utility in genetic studies. Inherent in this technology is considerable heterogeneity of target coverage and this is expected to systematically impact our sensitivity to detect genuine polymorphisms. To fully interpret the polymorphisms identified in a genetic study it is often essential to both detect polymorphisms and to understand where and with what probability real polymorphisms may have been missed. RESULTS: Using down-sampling of 30 deeply sequenced exomes and a set of gold-standard single nucleotide variant (SNV) genotype calls for each sample, we developed an empirical model relating the read depth at a polymorphic site to the probability of calling the correct genotype at that site. We find that measured sensitivity in SNV detection is substantially worse than that predicted from the naive expectation of sampling from a binomial. This calibrated model allows us to produce single nucleotide resolution SNV sensitivity estimates which can be merged to give summary sensitivity measures for any arbitrary partition of the target sequences (nucleotide, exon, gene, pathway, exome). These metrics are directly comparable between platforms and can be combined between samples to give “power estimates” for an entire study. We estimate a local read depth of 13X is required to detect the alleles and genotype of a heterozygous SNV 95% of the time, but only 3X for a homozygous SNV. At a mean on-target read depth of 20X, commonly used for rare disease exome sequencing studies, we predict 5–15% of heterozygous and 1–4% of homozygous SNVs in the targeted regions will be missed. CONCLUSIONS: Non-reference alleles in the heterozygote state have a high chance of being missed when commonly applied read coverage thresholds are used despite the widely held assumption that there is good polymorphism detection at these coverage levels. 
Such alleles are likely to be of functional importance in population-based studies of rare diseases, somatic mutations in cancer, and explaining the “missing heritability” of quantitative traits.
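The naive binomial expectation that the study shows to be optimistic can be written down directly. A sketch follows; the two-alternate-read calling threshold is an illustrative assumption, not the callers' actual rule.

```python
from math import comb

def p_detect_het(depth, min_alt=2, p_alt=0.5):
    # Probability that at least `min_alt` of `depth` reads carry the
    # non-reference allele at a heterozygous site, assuming reads
    # sample the two alleles as independent Bernoulli(p_alt) trials.
    return sum(comb(depth, k) * p_alt ** k * (1.0 - p_alt) ** (depth - k)
               for k in range(min_alt, depth + 1))
```

Under this model a 13X site detects a heterozygous SNV more than 99% of the time, whereas the paper's empirically calibrated model places 95% sensitivity at that same depth, i.e. measured sensitivity is substantially worse than the binomial baseline.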

    Ab initio molecular dynamics using density based energy functionals: application to ground state geometries of some small clusters

    The ground-state geometries of some small clusters have been obtained via ab initio molecular dynamical simulations employing density-based energy functionals. The approximate kinetic energy functionals employed are the standard Thomas-Fermi functional T_TF, the Weizsäcker correction T_W, and the combination F(N_e)T_TF + T_W. It is shown that the functional involving F(N_e) gives superior charge densities and bond lengths over the standard functional. Apart from dimers and trimers of Na, Mg, Al, Li and Si, equilibrium geometries for Li_nAl (n = 1-8) and Al_13 clusters are also reported. For all the clusters investigated, the method yields the ground-state geometries with the correct symmetries, with bond lengths within 5% of the corresponding results obtained via the full orbital-based Kohn-Sham method. The method is fast and promising for studying the ground-state geometries of large clusters.
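For reference, the standard forms of the two kinetic energy functionals named above are, in atomic units (the enhancement factor F(N_e) is the paper's own construction and is not reproduced here):

```latex
T_{\mathrm{TF}}[\rho] = C_F \int \rho(\mathbf{r})^{5/3}\, d\mathbf{r},
\qquad C_F = \tfrac{3}{10}\,(3\pi^2)^{2/3},
\qquad
T_W[\rho] = \frac{1}{8} \int \frac{|\nabla \rho(\mathbf{r})|^{2}}{\rho(\mathbf{r})}\, d\mathbf{r}.
```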